algorithm output
CARE: Turning LLMs Into Causal Reasoning Expert
Dong, Juncheng, Liu, Yiling, Aloui, Ahmed, Tarokh, Vahid, Carlson, David
Large language models (LLMs) have recently demonstrated impressive capabilities across a range of reasoning and generation tasks. However, research studies have shown that LLMs lack the ability to identify causal relationships, a fundamental cornerstone of human intelligence. We first conduct an exploratory investigation of LLMs' behavior when asked to perform a causal-discovery task and find that they mostly rely on the semantic meaning of variable names, ignoring the observation data. This is unsurprising, given that LLMs were never trained to process structural datasets. To first tackle this challenge, we prompt the LLMs with the outputs of established causal discovery algorithms designed for observational datasets. These algorithm outputs effectively serve as the sufficient statistics of the observation data. However, quite surprisingly, we find that prompting the LLMs with these sufficient statistics decreases the LLMs' performance in causal discovery. To address this current limitation, we propose CARE, a framework that enhances LLMs' causal-reasoning ability by teaching them to effectively utilize the outputs of established causal-discovery algorithms through supervised fine-tuning. Experimental results show that a finetuned Qwen2.5-1.5B model produced by CARE significantly outperforms both traditional causal-discovery algorithms and state-of-the-art LLMs with over a thousand times more parameters, demonstrating effective utilization of its own knowledge and the external algorithmic clues.
QMRNet: Quality Metric Regression for EO Image Quality Assessment and Super-Resolution
Berga, David, Gallés, Pau, Takáts, Katalin, Mohedano, Eva, Riordan-Chen, Laura, Garcia-Moll, Clara, Vilaseca, David, Marín, Javier
The latest advances in Super-Resolution (SR) have been tested on general-purpose images such as faces, landscapes and objects, but have rarely been applied to the task of super-resolving Earth Observation (EO) images. In this research paper, we benchmark state-of-the-art SR algorithms on distinct EO datasets using both Full-Reference and No-Reference Image Quality Assessment (IQA) metrics. We also propose a novel Quality Metric Regression Network (QMRNet) that is able to predict quality (as a No-Reference metric) by training on any property of the image (e.g. its resolution or its distortions) and is also able to optimize SR algorithms for a specific metric objective. This work is part of the implementation of the framework IQUAFLOW, which has been developed for evaluating image quality, detection and classification of objects, as well as image compression in EO use cases. We integrated our experimentation and tested our QMRNet algorithm on predicting features like blur, sharpness, SNR, RER and ground sampling distance (GSD), and obtain validation medRs below 1.0 (out of N=50) and recall rates above 95%. The overall benchmark shows promising results for LIIF, CAR and MSRN, and also the potential use of QMRNet as a loss for optimizing SR predictions. Due to its simplicity, QMRNet could also be used for other use cases and image domains, as its architecture and data processing are fully scalable.
Long-term Data Sharing under Exclusivity Attacks
Gafni, Yotam, Tennenholtz, Moshe
The quality of learning generally improves with the scale and diversity of data. Companies and institutions can therefore benefit from building models over shared data. Many cloud and blockchain platforms, as well as government initiatives, are interested in providing this type of service. These cooperative efforts face a challenge, which we call "exclusivity attacks". A firm can share distorted data, so that it learns the best model fit, but is also able to mislead others. We study protocols for long-term interactions and their vulnerability to these attacks, in particular for regression and clustering tasks. We conclude that the choice of protocol, as well as the number of Sybil identities an attacker may control, is material to vulnerability.
Cybersecurity AI: Integrating artificial intelligence into your security policy
Artificial intelligence is already being deployed and applied in a wide range of situations to boost productivity, increase sales, or improve user experiences. One area where AI use is still in its infancy is cybersecurity. Yet, at a time when hackers' ability to commit fraud and cause harm is more sophisticated than it has ever been, leveraging every tool is paramount if you want to stay ahead of the curve. In addition, the average enterprise is seeing steady growth in the number and type of users, devices, networks, and interfaces thanks to great strides made in cloud computing, the Internet of Things, 5G, network speeds, data volume, and other contemporary technologies. When deployed in concert with other defensive mechanisms, AI can be a powerful weapon against cyberattacks.
Stable Manipulation in Voting
Voting has served for centuries as a fundamental tool for aggregating the preferences of a set of people over a set of alternatives. A typical voting system consists of a set of alternatives, a set of voters each having a linear order over the set of alternatives as her preference, and a voting rule which selects a set of alternatives as winners depending on the voters' preferences. However, classical results show that every reasonable voting system with at least 3 alternatives can suffer from manipulation [Gib73, Sat75] -- an agent may be able to make her more favored alternative win by misreporting her preference. Bartholdi et al. pioneered the idea of using computational intractability as a barrier to safeguard elections against manipulation [BTT89, BO91]. Indeed, if we have m alternatives, then even if the manipulator knows the preferences of all other voters exactly, naively going over all (m! - 1) possible misreports and reporting the one which results in the best outcome for the manipulator is not feasible for any computationally bounded manipulator. Although the idea of Bartholdi et al. was to use computational intractability as a barrier against manipulation, the computational problem of manipulation turns out to be polynomial-time solvable for most of the commonly used voting rules, such as the scoring rules, maximin, Copeland, etc., with a prominent exception being the single transferable vote (STV) rule. Even for voting rules (STV, for example) for which a computational barrier against manipulation exists, it seems that the barrier may, in reality, be substantially weak due to the existence of heuristics which work well in practice [FP10, FKKN11, MR15, and references therein]. Motivation: The computational problem of manipulation has mostly been studied in what is called the complete information setting -- the manipulator knows the preferences of all other voters exactly.
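The naive search described in this abstract can be illustrated with a minimal sketch. The abstract does not fix a voting rule, so the Borda rule, the tie-breaking order, and the function names `borda_winner` and `best_manipulation` are assumptions made purely for illustration. The loop visits all m! rankings, which is why the approach stops being practical as m grows.

```python
from itertools import permutations


def borda_winner(profile, alternatives):
    """Borda winner of a profile (assumed rule, for illustration only).

    Ties are broken in favor of the alternative listed first in `alternatives`.
    """
    m = len(alternatives)
    scores = {a: 0 for a in alternatives}
    for ranking in profile:
        for position, a in enumerate(ranking):
            scores[a] += m - 1 - position  # top choice earns m - 1 points
    # max() keeps the first maximal element, which gives the tie-breaking order
    return max(alternatives, key=lambda a: scores[a])


def best_manipulation(true_pref, others, alternatives):
    """Enumerate every ranking and keep the report whose winner the
    manipulator ranks highest in her true preference."""
    best_report = tuple(true_pref)
    best_winner = borda_winner(others + [best_report], alternatives)
    for report in permutations(alternatives):  # m! candidate reports
        winner = borda_winner(others + [report], alternatives)
        if true_pref.index(winner) < true_pref.index(best_winner):
            best_report, best_winner = report, winner
    return best_report, best_winner


alternatives = ["a", "b", "c"]
others = [("b", "a", "c"), ("b", "a", "c"), ("a", "b", "c"), ("b", "a", "c")]
true_pref = ("a", "b", "c")

print(borda_winner(others + [true_pref], alternatives))    # honest vote: b
print(best_manipulation(true_pref, others, alternatives))  # (('a', 'c', 'b'), 'a')
```

In this toy profile, voting honestly lets b win, while burying b at the bottom of the reported ranking creates a tie that the assumed tie-breaking resolves in favor of a.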
Entropic Causality and Greedy Minimum Entropy Coupling
Kocaoglu, Murat, Dimakis, Alexandros G., Vishwanath, Sriram, Hassibi, Babak
We study the problem of identifying the causal relationship between two discrete random variables from observational data. We recently proposed a novel framework called entropic causality that works in a very general functional model but makes the assumption that the unobserved exogenous variable has small entropy in the true causal direction. This framework requires the solution of a minimum entropy coupling problem: given the marginal distributions of m discrete random variables, each on n states, find the joint distribution with minimum entropy that respects the given marginals. This corresponds to minimizing a concave function of n^m variables over a convex polytope defined by nm linear constraints, called a transportation polytope. Unfortunately, it was recently shown that this minimum entropy coupling problem is NP-hard, even for 2 variables with n states. Even representing points (joint distributions) over this space can require exponential complexity (in n, m) if done naively. In our recent work we introduced an efficient greedy algorithm to find an approximate solution for this problem. In this paper we analyze this algorithm and establish two results: that our algorithm always finds a local minimum, and that it is within an additive approximation error of the unknown global optimum.
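The abstract does not spell out the steps of the greedy algorithm, so the following is only a sketch under an assumed greedy rule: repeatedly take the largest remaining mass in each marginal, assign the smallest of those masses to the corresponding joint state, and subtract it from every marginal. The names `greedy_coupling` and `entropy` are hypothetical. Storing the joint distribution as a sparse dictionary avoids materializing all n^m entries.

```python
import math


def greedy_coupling(marginals, tol=1e-12):
    """Sketch of a greedy coupling (assumed rule, not taken from the abstract).

    `marginals` is a list of m probability vectors. Returns a sparse joint
    distribution as a dict mapping index tuples to probability mass.
    """
    remaining = [list(p) for p in marginals]  # work on copies
    joint = {}
    while True:
        # index of the largest remaining mass in each marginal
        idx = tuple(max(range(len(p)), key=p.__getitem__) for p in remaining)
        # the smallest of those maxima is the mass that can be placed jointly
        mass = min(p[i] for p, i in zip(remaining, idx))
        if mass <= tol:
            break
        joint[idx] = joint.get(idx, 0.0) + mass
        for p, i in zip(remaining, idx):
            p[i] -= mass  # at least one entry drops to zero per iteration
    return joint


def entropy(dist):
    """Shannon entropy in bits of a sparse distribution."""
    return -sum(q * math.log2(q) for q in dist.values() if q > 0)


coupling = greedy_coupling([[0.5, 0.5], [0.75, 0.25]])
print(coupling)           # {(0, 0): 0.5, (1, 0): 0.25, (1, 1): 0.25}
print(entropy(coupling))  # 1.5
```

Each iteration zeroes at least one marginal entry, so the loop runs a number of times that is at most linear in nm, and the resulting joint distribution has correspondingly few nonzero entries.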